Stela: On-Demand Elasticity in Distributed Data Stream Processing Systems

Authors

  • Indranil Gupta
  • Boyang Peng
  • Boyang Jerry Peng
Abstract

Big data is characterized by volume and velocity [24], and several real-time stream processing systems have recently emerged to address this challenge. These systems process streams of data in real time and produce computational results. However, current popular data stream processing systems lack the ability to scale out and scale in (i.e., increase or decrease the number of machines or VMs allocated to the application) efficiently and unintrusively when requested by the user on demand. To scale out or in, a critical problem to solve is determining which operator(s) of the stream processing application should be given more resources, or have resources taken away, in order to maximize application throughput. We address this problem with a novel metric called "Expected Throughput Percentage" (ETP). ETP takes into account not only congested elements of the stream processing application but also their effect on downstream elements and on the overall application throughput. Next, we show how our new system, called Stela (STream processing ELAsticity), incorporates ETP in its scheduling strategy. Stela enables scale-out and scale-in operations on demand, and achieves the twin goals of optimizing post-scaling throughput and minimizing interference to throughput during scaling. We have integrated the implementation of Stela into Apache Storm [27], a popular data stream processing system. We conducted experiments on Stela using a set of micro-benchmark topologies as well as two topologies from Yahoo! Inc. Our experimental results show that Stela achieves 45% to 120% higher post-scale throughput than the default Storm scheduler when performing scale-out operations, and a 40% to 500% throughput improvement over the default scheduler during the scale-in stage. This work is a joint project with Master's student Boyang Peng [1].

For Mom and Dad, who give me all they have.

Acknowledgments

I would like to thank my advisor, Indranil Gupta, for providing invaluable support, inspiration, and guidance for my research during my studies. I am also very grateful to him for his millions of suggestions and corrections, which helped me improve my writing skills. I would like to thank Boyang Jerry Peng for his collaboration on this project [1]. This work would not have been possible without them. I would also like to express my sincere gratitude to all former and current members of the Distributed Protocols Research Group (DPRG), for their constant support, like a family. Working with them has …


Similar resources

Stela: On-Demand Elasticity in Distributed Data Stream Processing Systems

Big data is characterized by volume and velocity [24], and several real-time stream processing systems have recently emerged to address this challenge. These systems process streams of data in real time and produce computational results. However, current popular data stream processing systems lack the ability to scale out and scale in (i.e., increase or decrease the number of machines or VMs allocated t...


Elasticity and Resource Aware

The era of big data has led to the emergence of new systems for real-time distributed stream processing; Apache Storm, for example, is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems, lacks many important and desired features. One important feature is elasticity for clusters running Storm, i.e. change the cluster size on de...


Total Electricity Demand Modeling: An Application of Spatial Panel Econometric Method

This paper aims to model total electricity demand (incremental) in order to estimate price and income elasticities using provincial data and the spatial panel data method. Electricity demand at the province level is influenced by climatic zones, which can be divided into temperate, cold and sub-tropical. This paper uses time series data for electricity demand in Iran’s 28 provinces, taking into...


Demand Response Based Model for Optimal Decision Making for Distribution Networks

In this paper, a heuristic mathematical model for optimal decision-making of a Distribution Company (DisCo) is proposed that employs demand response (DR) programs in order to participate in a day-ahead market, taking into account elastic and inelastic load models. The proposed model is an extended responsive load modeling that is based on price elasticity and customers’ incentives in which they...


Effects of Trip Purpose on Transit Fare Elasticity, Case Study of Isfahan

This paper explores the effects of trip purpose on price elasticity of bus mode. Data for the research was collected through passenger field survey in Isfahan, Iran. Due to the nature of the data, nonlinear regression and nonparametric statistics tools were used for analysis. It was found that the logarithmic function best explains the relationship between percentage of change in demand and per...




Publication date: 2015